Markov decision processes with delays and asynchronous cost collection

Authors

  • Konstantinos V. Katsikopoulos
  • Sascha E. Engelbrecht
Abstract

Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect at a later decision stage rather than immediately (action delay). Third, the cost induced by an action may be collected after a number of stages (cost delay). We derive two results, one for constant and one for random delays, for reducing an MDP with delays to an MDP without delays, which differs only in the size of the state space. The results are based on the intuition that costs may be collected asynchronously, i.e., at a stage other than the one in which they are induced, as long as they are discounted properly.
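The intuition about asynchronous cost collection can be checked numerically in the simplest case. The sketch below is an illustration only, not the paper's construction: it assumes a constant cost delay `k`, a discount factor `gamma`, and a fixed finite sequence of induced costs. The function name `discounted_total` and all values are hypothetical. Under these assumptions, collecting every cost `k` stages late multiplies the total discounted cost by `gamma ** k`.

```python
def discounted_total(costs, gamma, delay=0):
    """Total discounted cost when the cost induced at stage t
    is collected at stage t + delay (delay is constant)."""
    return sum(gamma ** (t + delay) * c for t, c in enumerate(costs))


# Hypothetical cost sequence induced at stages 0..4.
costs = [1.0, 0.5, 2.0, 0.0, 3.0]
gamma = 0.9  # discount factor (assumed)
k = 2        # constant cost delay in stages (assumed)

undelayed = discounted_total(costs, gamma)
delayed = discounted_total(costs, gamma, delay=k)

# Delayed collection equals the undelayed total scaled by gamma ** k.
assert abs(delayed - gamma ** k * undelayed) < 1e-12
```

Because the scaling factor `gamma ** k` is the same positive constant for every cost sequence, it does not change which of two sequences has the lower total in this simplified setting.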


Similar resources

Application of Markov Processes to the Machine Delays Analysis

Production and non-productive equipment and personnel delays are a critical element of any production system. The frequency and length of delays impact heavily on the production and economic efficiency of these systems. Machining processes in wood industry are particularly vulnerable to productive and non-productive delays. Whereas, traditional manufacturing industries usually operate on homoge...


A Formalism for Stochastic Decision Processes with Asynchronous Events

We present the generalized semi-Markov decision process (GSMDP) as a natural model for stochastic decision processes with asynchronous events in hope to spur interest in asynchronous models, often overlooked in AI literature.


Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...


Time Delay and Data Dropout Compensation in Networked Control Systems Using Extended Kalman Filter

In networked control systems, time delay and data dropout can degrade the performance of the control system and even destabilize the system. In the present paper, the Extended Kalman filter is employed to compensate the effects of time delay and data dropout in feedforward and feedback paths of networked control systems. In the proposed method, the extended Kalman filter is used as an observer ...


Other Agents' Actions as Asynchronous Events

An individual planning agent does not generally have sufficient computational resources at its disposal to produce an optimal plan in a complex domain, as deliberation itself requires and consumes scarce resources. This problem is further exacerbated in a distributed planning context in which multiple, heterogeneous agents must expend a portion of their resource allotment on communication, nego...



Journal:
  • IEEE Trans. Automat. Contr.

Volume 48  Issue 

Pages  -

Publication date: 2003